ABSTRACT

It is argued that given the importance and the increased use of 
multivariate techniques such as factor analysis and canonical correlation, 
students need to be made aware of multivariate methods and the appropriate 
ways in which they can be applied. As a general linear model that subsumes
all other parametric methods, canonical correlation analysis provides a 
natural framework for instruction involving all the various parametric 
procedures (e.g., ANOVA, ANCOVA). Furthermore, when canonical 
correlation analysis is used as an instructional tool, students gain an 
understanding of how all parametric procedures are special cases of canonical 
correlation analysis, that all parametric procedures involve the application of 
weights to derive synthetic scores, and that all parametric procedures are 
correlational, thus yielding a measure of effect important to the interpretation 
of one's results. A small heuristic data set is employed to demonstrate how 
canonical correlation analysis can be used as an instructional device in 
teaching both univariate and multivariate parametric methods. 
Canonical Correlation Analysis:
An Instructional Tool for All Parametric Statistical Procedures 

Traditionally, graduate level courses in research methodology have 
focused primarily on univariate procedures. As research by Willson (1982) 
indicates, until the 1970's textbooks in the field emphasized analysis of 
variance (ANOVA) methods. Following Cohen's (1968) seminal article on 
linear regression as a general linear model, textbooks, such as Kerlinger and 
Pedhazur's (1973) text, stressed the application of regression techniques. This 
also led to extensive application of regression analyses, as reported by Willson 
(1980) in a review of a decade of research. More recently, researchers have 
noted an increase in the application of multivariate techniques, such as factor 
analysis and canonical correlation analysis (Goodwin & Goodwin, 1985; 
Thompson, 1989a). While these techniques have existed for some time, this 
slow trend towards an increase in the use of multivariate procedures can be
attributed to the incorporation of such methods into major statistical 
computer packages (Krus, Reynolds, & Krus, 1976), which removed associated 
problems of mathematical complexity involved in calculation by hand. 

Many have stressed the importance of multivariate techniques and 
their advantages over traditional univariate methods (Campo, 1990; Fish, 
1988; Kerlinger, 1986). The strongest argument in favor of multivariate 
techniques has been summarized by LaGacda (1991) as follows: 

Researchers who believe that most outcomes are multiply 
caused, and that most interventions have multiple outcomes, 
simply must use multivariate analyses, or risk the seriously 
incorrect interpretations that can directly result from the failure 
to use analytic methods that honor a view of reality presuming
that reality is complex (p. 153).
Given the importance and the increased use of such techniques, students 
need to be made aware of multivariate methods and the appropriate ways in 
which they can be applied. As Kerlinger (1986) states, "one cannot conceive of 
modern behavioral research without also recognizing the necessity for 
students of research to study these admittedly difficult yet indispensable 
approaches to research problems" (p. ix). 

Multivariate methods do involve complex, restrictive mathematical 
manipulations. The mathematical training required to implement these 
procedures by hand, however, is typically beyond the scope of most graduate 
programs within the behavioral sciences. Therefore, if students are to acquire 
a basic conceptual understanding of multivariate methods, a heuristic 
framework that does not require an extensive mathematical background is 
essential. Such a framework is inherent in the multivariate procedure 
referred to as canonical correlation analysis. 

Researchers have for some time recognized that canonical correlation 
analysis, not regression analysis, is the most general linear model that 
subsumes all other parametric procedures (Baggaley, 1981; Fornell, 1978; 
Knapp, 1978). As such, it provides a natural instructional tool for all 
parametric methods, both univariate and multivariate. Knapp (1978) and 
others (Campo, 1990; Thompson, 1985) have detailed how canonical 
correlation analysis will produce the same results as other parametric 
methods. The purpose of the present paper is to demonstrate in concrete, 
mathematically simple terms, how canonical correlation analysis can be 
employed as an instructional device for teaching research methodology. This 
discussion will include a brief review of the basics of canonical correlation
analysis, its value as an instructional tool, and its advantages over other
statistical procedures. 

The Basics of Canonical Correlation Analysis 
In general, canonical correlation analysis is a method for investigating 
the relationship between two sets of variables, a set of dependent variables 
and a set of independent variables, where each set contains two or more 
variables (Thompson, 1984). Simply put, it involves the calculation of a set of 
weights for each group of variables which, when applied, yields a linear 
composite, or synthetic score, for each set. These weights are derived such 
that the bivariate correlation between the pairs of composite scores is 
maximized. This bivariate correlation is the canonical correlation, Rc, which
can be squared to obtain an estimate of the variance shared by the composite
scores. If the first set of variables contains p variables and the second set has q
variables, where q is less than or equal to p, then a total of q pairs of linear
combinations are possible, and each successive pair of composite scores will be
perfectly uncorrelated with all previously derived composites (Cooley &
Lohnes, 1971; Stevens, 1986; Thompson, 1984). For a more detailed discussion
of canonical correlation analysis and its interpretation, the reader is referred 
to Thompson's (1984) treatment of the method. 
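
The description above can be restated compactly in symbols. The notation below (x and y for the two variable sets, a and b for the weight vectors, u and v for the synthetic scores) is introduced here only for illustration and is not taken from the sources cited.

```latex
\[
u = a'x, \qquad v = b'y, \qquad
R_c = \max_{a,\,b}\ \operatorname{corr}(u, v)
\quad \text{subject to } \operatorname{var}(u) = \operatorname{var}(v) = 1 ,
\]
\[
R_c^2 = \text{proportion of variance shared by the composites } u \text{ and } v .
\]
```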

Instructional Value of Canonical Correlation Analysis 
Thompson (1984) describes canonical correlation analysis using the
framework of the very familiar bivariate technique, as the resulting canonical
correlation is a bivariate correlation coefficient. This approach to canonical
correlation analysis is attractive instructionally "because most students feel
comfortable working with bivariate correlation coefficients" (Thompson,
1987, p. 3). Furthermore, it aids students in gaining "important insights
regarding the relatedness of all parametric methods" (Campo, 1990, p. 9).
Campo (1990) discusses three such "insights" that contribute to the value of 
canonical correlation analysis as an instructional tool. First, since canonical 
correlation analysis subsumes all other parametric methods, all such methods 
can be considered special cases of canonical correlation analysis. As such, 
canonical correlation analysis can be applied to perform any parametric 
analysis. The converse is not true, however; that is, canonical correlation
analysis cannot be performed using less sophisticated methods (Campo,
1990). 

Use of canonical correlation analysis as a heuristic framework also 
enables the student to see how all parametric methods apply weights to create 
synthetic scores. Furthermore, it is these synthetic scores that are the focus of
all analyses (Campo, 1990; Thompson, 1987). 

Finally, a bivariate approach to canonical correlation analysis 
demonstrates that all parametric methods are correlational and, as such, yield 
a measure of effect size analogous to r². Thompson (1989b) emphasizes the
importance of interpreting effect size estimates with all analyses in order to 
gain an understanding of the importance of one's results. 



Canonical Correlation Analysis as a Heuristic Framework 
Before examining how canonical correlation analysis yields the same 
results as other parametric procedures, a comment concerning some of the 
various statistics reported by the different methods is in order. Students are 
familiar with the F statistic reported by most univariate procedures, and in 
particular, in ANOVA techniques. They are not, however, as familiar with 
the various test statistics that are reported with many multivariate
procedures. In canonical correlation analysis, a particular test statistic of
interest, other than the canonical correlation Rc, is Wilks' lambda. As Glass
and Stanley (1970) have pointed out, all test statistics, such as Z, t, chi-square
and F, are related. Although the relationship between F and Wilks' lambda
is not a direct one, Rao (1952) has provided a formula for converting the 
lambda resulting from a canonical correlation analysis to a value whose 
distribution approximates the F distribution. When either p or q is less than 
or equal to two, which is the case when canonical correlation analysis is 
applied in place of many univariate procedures, this conversion is exact 
(Knapp, 1978) and simplifies to (Cooley & Lohnes, 1971; Thompson, 1985): 

\[ F = \frac{1 - \lambda}{\lambda} \times \frac{df_{error}}{df_{effect}} \]
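
As a minimal sketch of this conversion, the function below applies the simplified formula to the lambda and degrees of freedom reported in Table 2. The function name and argument names are illustrative choices, not part of any statistical package.

```python
def lambda_to_f(wilks_lambda, df_effect, df_error):
    """Convert Wilks' lambda to F with the simplified conversion quoted above."""
    if not 0.0 < wilks_lambda <= 1.0:
        raise ValueError("lambda must lie in the interval (0, 1]")
    return ((1.0 - wilks_lambda) / wilks_lambda) * (df_error / df_effect)


# Values reported in Table 2 (X with B): lambda = 0.79794, df = 1/16.
f_table2 = lambda_to_f(0.79794, df_effect=1, df_error=16)
print(f"F = {f_table2:.3f}")
```

The result agrees, to rounding, with the F of 4.052 reported in Table 2.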

Presented in Table 1 is a small heuristic data set employed to 
demonstrate how canonical correlation analysis produces the same results as 
other univariate and multivariate methods. The data set contains two 
continuous dependent variables, Y and X, and two independent variables: a
discrete variable, A, which might represent experimental groupings or
categories, and a continuous variable, B. The discrete variable B' is a
dichotomization of the variable B, created by collapsing B in a manner similar 
to how researchers often treat aptitude variables when investigating an 
aptitude-treatment interaction effect through an ANOVA design. Note that 
both sets contain two variables, the minimum requirement for performing a 
true canonical correlation analysis. 

As many researchers have noted, in order to apply canonical 
correlation analysis in place of some parametric methods, specifically various 
ANOVA techniques, some form of contrast coding must be employed
(Campo, 1990; Knapp, 1978; Thompson, 1985). A review of the various 
methods of contrast coding is beyond the scope of this discussion; the 
interested reader is referred to Pedhazur (1982), who provides an excellent 
elaboration of such coding techniques. For the present discussion, included 
with the data set in Table 1 is the contrast coding for both of the independent 
variables A and B'. Appendix A contains the SAS program statements to 
perform all of the analyses discussed below. The statements for creating the 
contrast coding are included. 
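
The coding rules in Appendix A can be sketched outside SAS as well. The following Python fragment mirrors those rules for the three levels of A and the two levels of B'; the function name is hypothetical, and the orthogonality check assumes equal cell sizes.

```python
def contrast_code(a, bb):
    """Return (A1, A2, B1, A1B1, A2B1) for group a in {1, 2, 3} and bb in {1, 2}."""
    a1 = {1: 1, 2: -1, 3: 0}[a]    # level 1 versus level 2
    a2 = {1: -1, 2: -1, 3: 2}[a]   # levels 1 and 2 versus level 3
    b1 = {1: 1, 2: -1}[bb]         # first versus second level of B'
    return (a1, a2, b1, a1 * b1, a2 * b1)


# One row of codes per cell of the A by B' design.
cells = [(a, bb) for a in (1, 2, 3) for bb in (1, 2)]
codes = [contrast_code(a, bb) for a, bb in cells]

# With equal cell sizes, every contrast sums to zero and every pair of
# contrasts has a zero cross-product, i.e., the contrasts are orthogonal.
column_sums = [sum(column) for column in zip(*codes)]
cross_products = [
    sum(row[i] * row[j] for row in codes)
    for i in range(5)
    for j in range(i + 1, 5)
]
print(column_sums, cross_products)
```

Because the contrasts are mutually orthogonal in a balanced design, each effect can be isolated by dropping only its own contrast variables from the predictor set.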

Insert Table 1 about here. 



Table 2 compares the Pearson product moment correlation between X 
and B to the canonical correlation analysis between the same variables. As 
noted previously, Rc is the bivariate correlation coefficient between the two 
composite scores derived through canonical correlation analysis. Since
multiplicative constants applied to variables have no impact whatsoever on 
correlations between the variables, the r between the variables is also the 
canonical Rc between the variables after weighting by the canonical 
coefficients to transform the observed variables into latent composite scores. 
In this case, each set contains only one variable, that is, p = q = 1. Thus, the two
sets of variables cannot be reduced any further and the canonical correlation
Rc is equivalent to the bivariate correlation r. Note that while Rc can never
be negative, the magnitudes of Rc and r will always be the same.
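
A brief sketch may make the single-variable case concrete. The data below are hypothetical (they are not the Table 1 values), and the helper names are illustrative. With one variable in each set, the only available weights are multiplicative constants, so the canonical correlation reduces to the absolute value of r.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


def canonical_r_single_pair(x, y):
    """With p = q = 1 the best weights are +1 or -1, so Rc equals |r|."""
    return abs(pearson_r(x, y))


# Hypothetical scores, chosen only so that r is negative.
x = [2, 4, 5, 7, 9, 10]
b = [9, 8, 8, 5, 4, 1]

r = pearson_r(x, b)
rc = canonical_r_single_pair(x, b)
# Positive multiplicative constants leave the correlation unchanged.
r_rescaled = pearson_r([3.0 * v for v in x], [0.5 * v for v in b])
print(f"r = {r:.5f}, Rc = {rc:.5f}, r after rescaling = {r_rescaled:.5f}")
```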

Insert Table 2 about here. 
Table 3 demonstrates how canonical correlation analysis and t-test
analysis yield the same results. A t-test analysis was performed for the
dependent variable Y with the independent variable B'. Recall that the
squared value of a t statistic with n degrees of freedom is equivalent to an F
statistic with 1 and n degrees of freedom (Glass & Stanley, 1970). Thus, using 
Rao's conversion to calculate F, it is evident that the two procedures are 
equivalent. 
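
The equivalence can be checked numerically. The sketch below uses two small hypothetical groups (not the Table 1 data) and illustrative function names: it computes the pooled-variance t statistic directly, then obtains F by correlating the scores with a +1/-1 contrast code, taking lambda = 1 - r², and applying the simplified conversion.

```python
from math import sqrt


def pooled_t(group1, group2):
    """Independent-samples t statistic with a pooled variance estimate."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    ss1 = sum((v - m1) ** 2 for v in group1)
    ss2 = sum((v - m2) ** 2 for v in group2)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    return (m1 - m2) / sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))


def f_via_contrast_code(group1, group2):
    """F obtained from the correlation of the scores with a +1/-1 contrast."""
    y = list(group1) + list(group2)
    code = [1.0] * len(group1) + [-1.0] * len(group2)
    n = len(y)
    mean_y, mean_c = sum(y) / n, sum(code) / n
    s_yc = sum((v - mean_y) * (c - mean_c) for v, c in zip(y, code))
    s_yy = sum((v - mean_y) ** 2 for v in y)
    s_cc = sum((c - mean_c) ** 2 for c in code)
    r_squared = s_yc ** 2 / (s_yy * s_cc)      # squared Rc when p = q = 1
    wilks_lambda = 1.0 - r_squared
    return ((1.0 - wilks_lambda) / wilks_lambda) * (n - 2) / 1


# Hypothetical scores for the two levels of a dichotomous variable.
low = [4, 6, 5, 9, 7]
high = [10, 8, 12, 11, 14]

t = pooled_t(low, high)
f = f_via_contrast_code(low, high)
print(f"t squared = {t ** 2:.4f}, F = {f:.4f}")
```

The same check applies to the reported results: squaring the t of -3.3020 in Table 3 gives 10.9032, the F obtained through the canonical analysis.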

Insert Table 3 about here. 



The conventional ANOVA summary table for a 3 X 2 factorial analysis
on the dependent variable Y and the factors A and B' is presented in Table 4. 
To conduct a two factor ANOVA through canonical correlation analysis, four 
separate canonical correlation analyses are required. The first analysis 
includes all of the contrast variables, A1, A2, B1, A1B1, and A2B1, representing
both factors and their interaction (i.e., all possible effects). Each of the
remaining three analyses excludes the contrast variables associated
with one specific effect. The resulting lambdas from these analyses are
contained in Table 5.

Insert Tables 4 and 5 about here. 



As Thompson (1985) notes, Wilks' lambda is analogous to the sums of
squares (SOS) within or error in conventional oneway ANOVA, that is,
lambda = 1 - (SOSBetween/SOSTotal) or (SOSError/SOSTotal). Both are
estimates of effect, but, whereas SOSBetween gets larger as an effect increases,
lambda gets smaller.

As a measure of effect, lambda estimates the effect of all those variables
included in its calculation. The effect of a particular variable, as measured by 
lambda, can be determined by partitioning out the effect of the other variables 
from the overall lambda, which measures the effect of all the variables. The 
calculations yielding the individual lambdas for each effect are presented in
Table 6. The individual lambdas may then be converted into F statistics by
applying Rao's (1952) conversion, as they are in Table 7. Comparison of the
values in Tables 4 and 7 demonstrates that factorial ANOVA and CCA yield 
equivalent F statistics. 
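
The partitioning just described can be sketched in a few lines. The lambdas below are those reported for the four canonical models, with the full-model value taken as 0.05755 as in Table 6; the function and variable names are illustrative only.

```python
def effect_lambda(full_model_lambda, reduced_model_lambda):
    """Lambda for one effect: full model over the model omitting that effect."""
    return full_model_lambda / reduced_model_lambda


def lambda_to_f(wilks_lambda, df_effect, df_error):
    """Simplified conversion of lambda to F used in Table 7."""
    return ((1.0 - wilks_lambda) / wilks_lambda) * (df_error / df_effect)


FULL = 0.05755                 # model 1: A1 A2 B1 A1B1 A2B1
REDUCED = {                    # (lambda of model omitting the effect, df effect)
    "A": (0.57554, 2),         # model 2
    "B'": (0.46283, 1),        # model 3
    "AB'": (0.07674, 2),       # model 4
}
DF_ERROR = 12

f_values = {}
for source, (reduced, df_effect) in REDUCED.items():
    lam = effect_lambda(FULL, reduced)
    f_values[source] = lambda_to_f(lam, df_effect, DF_ERROR)
    print(f"{source:4s} lambda = {lam:.5f}  F = {f_values[source]:.2f}")
```

The resulting F values agree with the conventional ANOVA summary table to within the rounding of the five-decimal lambdas.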

Insert Tables 6 and 7 about here. 



It is also of note that 1 - lambda, where the lambda of interest is the 
multivariate lambda of the full model, is equal to the squared canonical
correlation coefficient Rc² for the full model (Thompson, 1988). For this
example, 1 - 0.05755 = 0.94245, which is equal to Rc² for the canonical
correlation analysis and eta² for the factorial ANOVA, providing further
evidence that canonical correlation does subsume ANOVA. 

An explanation of how canonical correlation also subsumes factorial 
MANOVA follows directly from the example on factorial ANOVA. The
results of a factorial MANOVA for the dependent variables X and Y with the
variables A and B' and the corresponding canonical correlation analysis are
presented in Tables 8 through 10. As Thompson (1985) notes, the
calculations are simplified since MANOVA results are reported in the form
of lambdas.

Insert Tables 8, 9 and 10 about here.



Results of a multiple linear regression analysis of Y on X and B and the
corresponding canonical correlation analysis are presented in Table 11. Note
that in multiple regression, the multiple R squared is the squared correlation
coefficient between the criterion score, Y, and the composite score, Yhat.
Thus, it follows logically that the multiple R squared of multiple regression is
equivalent to the squared canonical correlation coefficient resulting from
canonical correlation analysis. Furthermore, although it is not so readily
apparent, the regression beta weights are related to the function coefficients
generated through canonical correlation analysis. Thompson and Borrello (1985)
provide a detailed discussion of how the two sets of coefficients are equated
through a variance adjustment applying either Rc or R. Table 12
demonstrates this relationship for the current example. 
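
The variance adjustment can be illustrated with a two-predictor sketch. The three correlations below are hypothetical, and the closed-form expressions for the standardized weights are the standard two-predictor regression formulas rather than anything specific to the cited sources. Dividing each beta weight by R rescales the composite to unit variance, which is the scaling assumed here for the standardized function coefficients.

```python
from math import sqrt


def standardized_betas(r_y1, r_y2, r_12):
    """Beta weights for two standardized predictors from their correlations."""
    denom = 1.0 - r_12 ** 2
    return ((r_y1 - r_y2 * r_12) / denom, (r_y2 - r_y1 * r_12) / denom)


# Hypothetical correlations: criterion with each predictor, and between predictors.
r_y1, r_y2, r_12 = -0.80, 0.50, -0.30

beta1, beta2 = standardized_betas(r_y1, r_y2, r_12)
multiple_r = sqrt(beta1 * r_y1 + beta2 * r_y2)      # R, which equals Rc here

# Function coefficient = beta weight / R; equivalently, beta = coefficient * R.
coef1, coef2 = beta1 / multiple_r, beta2 / multiple_r

# Variance of the composite built from the function coefficients.
composite_var = coef1 ** 2 + coef2 ** 2 + 2.0 * coef1 * coef2 * r_12
print(f"R = {multiple_r:.5f}, composite variance = {composite_var:.5f}")

# Same adjustment applied to the function coefficient reported for X.
beta_x = -0.7852 * 0.85813
print(f"beta for X = {beta_x:.4f}")
```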

Insert Tables 11 and 12 about here. 



The relationship between canonical correlation analysis and 
discriminant analysis has been previously demonstrated by Tatsuoka (1989) 
and others (Dunteman, 1984; Xitao, 1992). The objective of discriminant
analysis is the prediction of group membership on the basis of some set of 
scores. For this example, Table 13 presents the results of a discriminant 
analysis with the variables X and Y to predict membership for the variable A. 
Also included in Table 13 are the results of a canonical correlation analysis for
X and Y with the contrast variables A1 and A2 representing the levels of A.

Insert Table 13 about here.



Again, the results are equivalent, except for the function coefficients. 
Equivalence of the function coefficients can be demonstrated by setting the 
largest coefficient to one (Tatsuoka, 1989), as is shown in Table 13. Although 
this method clearly demonstrates that the two sets of coefficients have the 
same ratio, the relationship between them is not clear. A more explicit 
description of the relationship between the two sets of coefficients has been
provided by Xitao (1992). Similar to the comparison between multiple 
regression and canonical correlation analysis, the relationship between the 
resulting two sets of function coefficients can be demonstrated through a 
variance adjustment involving the pooled within-group covariance matrix 
(Xitao, 1992). Basically, the relationship between the two sets of function 
coefficients is: 

\[ a_D = \frac{a_c}{\sqrt{a_c' \, S_{pooled} \, a_c}} \]

where aD is the vector of function coefficients from the discriminant analysis,
ac is the vector of function coefficients from the canonical correlation 
analysis, and Spooled is the pooled within-group covariance matrix of the 
original predictor variables (Xitao, 1992). For this example, correspondence 
between the two sets of function coefficients for the first function is presented 
in Table 14. 
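
A minimal sketch of this adjustment, using the raw canonical function coefficients and the pooled within-group covariance matrix as they are reported in the tables, follows. The function name is illustrative, and the computation is carried out with the rounded published values, so the last decimal places may differ slightly from the tabled results.

```python
from math import sqrt


def rescale_to_discriminant(a_c, s_pooled):
    """Divide a_c by sqrt(a_c' S_pooled a_c); return the new vector and the scale."""
    k = len(a_c)
    quadratic_form = sum(
        a_c[i] * s_pooled[i][j] * a_c[j] for i in range(k) for j in range(k)
    )
    scale = sqrt(quadratic_form)
    return [weight / scale for weight in a_c], scale


a_c = [0.39696, -0.01250]                    # canonical coefficients for X and Y
s_pooled = [[1.6556, -1.8333],
            [-1.8333, 6.70]]                 # pooled within-group covariance matrix

a_d, scale = rescale_to_discriminant(a_c, s_pooled)
print(f"scale = {scale:.4f}, aD = [{a_d[0]:.4f}, {a_d[1]:.4f}]")
```

The rescaled weights reproduce, to about three decimal places, the discriminant function coefficients of 0.75003 and -0.02361.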



Insert Table 14 about here. 






A concrete example of how canonical correlation analysis subsumes 
factor analysis, specifically the principal components method, has not been 
included as a part of this discussion. However, some general comments 
concerning this issue can be made. First, it should be noted that both
canonical correlation analysis and principal components are variable 
reduction techniques (Stevens, 1986). Both methods reduce a set of variables 
into a set of synthetic scores which contains all or most of the variance of the 
original variables. Whereas in canonical correlation analysis, the objective is
to derive a set of scores such that the correlation between the two sets of 
variables is maximized, in principal components the objective is to maximize 
the correlation within a single set of variables (Campo, 1990). Thus, principal 
components analysis may be thought of as the case where there is only one set 
of variables instead of two. 



Advantages of Canonical Correlation Analysis 
Believing that most of the phenomena that are of interest in the
behavioral sciences have multiple causes and outcomes, researchers typically 
measure several different but related variables. Having been trained to apply 
traditional univariate methods, researchers will often conduct multiple tests 
within a single study. There are several problems with multiple univariate 
tests, however, that can be avoided through the application of multivariate 
procedures, including canonical correlation analysis. 

First, the use of multiple univariate tests disregards the variance that is 
shared between the multiple dependent and multiple independent variables 
that exists in reality (Thompson, 1984). Furthermore, ANOVA techniques 
discard even more variance by requiring that all independent variables be 
scaled at the nominal level of measurement. As a result, the reality the 
researcher wishes to generalize to is distorted. Canonical correlation analysis,
however, allows for variables of any level of measurement and was designed 
to examine multiple variables simultaneously (Thompson, 1984). Thus, the 
reality the researcher has strived to represent by collecting multiple measures 
is preserved. 

The second problem with multiple univariate tests concerns the 
probability of committing a Type I error. As the number of hypotheses 
within a study increases, the experimentwise error rate, that is, the
probability that one or more Type I errors have occurred in the study as a whole,
inflates (Thompson, 1988). This problem can be alleviated through the
application of multivariate techniques such as canonical correlation analysis 
where fewer hypotheses or a single hypothesis is tested. And finally, as both 
Fish (1988) and Thompson (1986) have demonstrated, by employing several 
univariate tests one may fail to find statistically significant results that are 
present when a multivariate test is employed. 
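
The inflation of the experimentwise error rate can be illustrated with a short calculation. The sketch assumes that the tests are independent and that each is conducted at the same alpha level; with correlated tests the rate is smaller, but it still grows with the number of tests. The function name is illustrative.

```python
def experimentwise_error_rate(alpha, n_tests):
    """Probability of at least one Type I error across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Assumed per-test alpha of .05 and an increasing number of univariate tests.
rates = {k: experimentwise_error_rate(0.05, k) for k in (1, 5, 10)}
for k, rate in rates.items():
    print(f"{k:2d} tests: experimentwise rate = {rate:.4f}")
```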

Summary 

Given the multivariate nature of reality, it is imperative for students of 
research in the behavioral sciences to become familiar with the various 
multivariate statistical procedures that are readily available through the use 
of computers. Although such techniques are mathematically complicated, 
the use of canonical correlation analysis as a heuristic framework enables 
students to gain a deeper conceptual understanding of all parametric 
methods, both univariate and multivariate. Canonical correlation analysis 
(Thompson, 1991) employed as an instructional tool demonstrates (a) how all 
parametric procedures are special cases of canonical correlation analysis, (b) 
that all parametric procedures involve the application of weights to derive 
synthetic scores, and (c) that all parametric procedures are correlational, thus
yielding a measure of effect important to the interpretation of one's results. 
Furthermore, the application of multivariate techniques such as canonical 
correlation analysis can overcome serious problems associated with the use of 
multiple univariate tests. This is not to imply that all analyses should be 
carried out through canonical correlation analysis, but that students should be 
made aware of the various procedures that are available such that they are 
able to apply the appropriate method in their own research and provide a 
more accurate representation of a complex reality.

Table 1

Hypothetical Data Set with Contrast Coding for Heuristic Demonstration
Table 2

Pearson Product Moment Correlation through CCA (X with B)

              CCA        Pearson Correlation
Squared Rc    0.20206
Rc (r)        0.44951    -0.44951
lambda        0.79794
F             4.052
df            1/16
p             0.0613      0.0613

Table 3

t-test Analysis through CCA [Y by B' (1,2)]

CCA                        t-test Analysis
Squared Rc    0.40528      Mean of group 1     6.33
Rc            0.63661      Sd                  2.3452
lambda        0.59472      Mean of group 2    10.67
                           Sd                  3.1623
                           t                  -3.3020
F            10.9032       t²                 10.9032
df            1/16         df                 16
p             0.0045       p                   0.0045



Table 4

Factorial ANOVA [Y by A (1,3), B' (1,2)]

Source       SOS     df       MS    Fcalc
A         108.00      2    54.00    54.00
B'         84.50      1    84.50    84.50
AB'         4.00      2     2.00     2.00
Error      12.00     12     1.00
Total     208.50     17

eta² = 196.5/208.5 = 0.94245
Table 5

Canonical Analysis of Four Models

Model    Predictors of Y         Lambda
1        A1 A2 B1 A1B1 A2B1      0.05755
2        B1 A1B1 A2B1            0.57554
3        A1 A2 A1B1 A2B1         0.46283
4        A1 A2 B1                0.07674



Table 6

Conversion to ANOVA Lambdas

Source    Models    Calculation         Lambda
A         1/2       0.05755/0.57554     0.09999
B'        1/3       0.05755/0.46283     0.12434
AB'       1/4       0.05755/0.07674     0.74993



Table 7

Conversion of Lambdas to ANOVA F's

Source    [(1 - lambda)/lambda] * [df error/df effect]    = Fcalc
A         [(1 - 0.09999)/0.09999] * [12/2]                = 54.00
B'        [(1 - 0.12434)/0.12434] * [12/1]                = 84.50
AB'       [(1 - 0.74993)/0.74993] * [12/2]                =  2.00

Rc² = 1 - lambda = 1 - 0.05755 = 0.94245 = eta²







Table 8

Factorial MANOVA [Y, X with A (1,3), B' (1,2)]

Source    Lambda     Fcalc    df      p
A         0.04133    21.56    4/22    0.0001
B'        0.08247    61.19    2/11    0.0001
AB'       0.74275     0.88    4/22    0.49



Table 9

Canonical Analysis of Four Models

Model    Predictors of Y and X    Lambda
1        A1 A2 B1 A1B1 A2B1       0.01802
2        B1 A1B1 A2B1             0.43614
3        A1 A2 A1B1 A2B1          0.21854
4        A1 A2 B1                 0.02427



Table 10

Conversion to MANOVA Lambdas

Source    Models    Calculation         Lambda
A         1/2       0.01802/0.43614     0.04132
B'        1/3       0.01802/0.21854     0.08246
AB'       1/4       0.01802/0.02427     0.74248



Table 11

Multiple Regression through CCA [Y with X and B]

CCA                        Regression Analysis
Squared Rc    0.73638      Squared R     0.7364
Rc            0.85813      R             0.85813
lambda        0.26362
F            20.95         F            20.95
df            2/15         df            2/15
p             0.0001       p             0.0001








Table 12

Function Coefficient and Beta Weight Conversions

             Function                       Beta                       Function
Predictor    Coefficient * Rc (or R)    =   Weight / Rc (or R)     =   Coefficient
X            -0.7852 * 0.85813          =   -0.67384/0.85813       =   -0.7852
B             0.3598 * 0.85813          =    0.30871/0.85813       =    0.3598
Table 13

Discriminant Analysis through CCA [A with X and Y]

              CCA        Discriminant Analysis
Squared Rc    0.75284
Rc            0.86766    0.86766
lambda        0.24494
F             7.1438     7.1438
df            4/28       4/28
p             0.0004     0.0004

Raw Function Coefficients

CCA: Function I
X     0.39696   -->    0.39696/0.39696  =  1
Y    -0.0125    -->   -0.0125/0.39696   = -0.03148

Discriminant: Function I
X     0.75003   -->    0.75003/0.75003  =  1
Y    -0.02361   -->   -0.02361/0.75003  = -0.03148



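Table 14

Relationship Between Canonical and Discriminant Function Coefficients

\[ S_{pooled} = \begin{bmatrix} 1.6556 & -1.8333 \\ -1.8333 & 6.70 \end{bmatrix} \]

\[ \sqrt{a_c' \, S_{pooled} \, a_c}
   = \sqrt{ \begin{bmatrix} 0.39696 & -0.01250 \end{bmatrix}
            \begin{bmatrix} 1.6556 & -1.8333 \\ -1.8333 & 6.70 \end{bmatrix}
            \begin{bmatrix} 0.39696 \\ -0.01250 \end{bmatrix} }
   = 0.52956 \]

\[ a_D = \frac{a_c}{\sqrt{a_c' \, S_{pooled} \, a_c}}
       = \frac{1}{0.52956} \begin{bmatrix} 0.39696 \\ -0.01250 \end{bmatrix}
       = \begin{bmatrix} 0.75003 \\ -0.02361 \end{bmatrix} \]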
Appendix A

DATA D1; INFILE CAN;
  INPUT Y X A B;
  * Contrast coding for the three levels of A;
  IF A = 1 THEN A1 = 1;
    ELSE IF A = 2 THEN A1 = -1;
    ELSE IF A = 3 THEN A1 = 0;
  IF A = 1 OR A = 2 THEN A2 = -1;
    ELSE IF A = 3 THEN A2 = 2;
  * Dichotomize B into BB, then contrast code BB;
  IF B < 5 THEN BB = 1;
    ELSE IF B > 5 THEN BB = 2;
  IF BB = 1 THEN B1 = 1;
    ELSE IF BB = 2 THEN B1 = -1;
  * Interaction contrasts;
  A1B1 = A1*B1;
  A2B1 = A2*B1;
PROC SORT;
  BY A BB B;
PROC PRINT;
  VAR Y X A B BB A1 A2 B1 A1B1 A2B1;
  TITLE 'RAW DATA SET WITH CONTRAST CODING';
PROC CORR;
  VAR X B;
  TITLE 'CORRELATION OF PREDICTOR AND CRITERION VARIABLE';
PROC CANCORR SIMPLE CORR;
  VAR X;
  WITH B;
  TITLE 'CCA SUBSUMES PEARSON CORRELATION';
PROC TTEST;
  CLASS BB;
  VAR Y;
  TITLE 'T TEST FOR DEP VAR Y AND INDEP VAR B';
PROC CANCORR SIMPLE CORR;
  VAR Y;
  WITH B1;
  TITLE 'CCA SUBSUMES T TEST: INDEP VAR CONTRAST CODING';
PROC ANOVA;
  CLASS A BB;
  MODEL Y = A BB A*BB;
  TITLE 'ANOVA WITH DEP VAR Y AND INDEP VARS A AND B';



Note: BB refers to the dichotomized variable B' in the text, as B' is not a
valid variable name in SAS.






* Four canonical models for the factorial ANOVA (full model, then
  each effect omitted in turn);
PROC CANCORR SIMPLE CORR;
  VAR Y;
  WITH A1 A2 B1 A1B1 A2B1;
  TITLE 'CCA SUBSUMES FACTORIAL ANOVA';
PROC CANCORR SIMPLE CORR;
  VAR Y;
  WITH B1 A1B1 A2B1;
  TITLE 'CCA SUBSUMES FACTORIAL ANOVA';
PROC CANCORR SIMPLE CORR;
  VAR Y;
  WITH A1 A2 A1B1 A2B1;
  TITLE 'CCA SUBSUMES FACTORIAL ANOVA';
PROC CANCORR SIMPLE CORR;
  VAR Y;
  WITH A1 A2 B1;
  TITLE 'CCA SUBSUMES FACTORIAL ANOVA';
PROC CANCORR SIMPLE CORR;
  VAR Y;
  WITH X B;
  TITLE 'CCA SUBSUMES MULTIPLE REGRESSION';
PROC ANOVA;
  CLASS A BB;
  MODEL Y X = A BB A*BB;
  MANOVA H=_ALL_ / SUMMARY;
  TITLE 'FACTORIAL MANOVA';
* Four canonical models for the factorial MANOVA;
PROC CANCORR SIMPLE CORR;
  VAR Y X;
  WITH A1 A2 B1 A1B1 A2B1;
  TITLE 'CCA SUBSUMES FACTORIAL MANOVA';
PROC CANCORR SIMPLE CORR;
  VAR Y X;
  WITH B1 A1B1 A2B1;
  TITLE 'CCA SUBSUMES FACTORIAL MANOVA';
PROC CANCORR SIMPLE CORR;
  VAR Y X;
  WITH A1 A2 A1B1 A2B1;
  TITLE 'CCA SUBSUMES FACTORIAL MANOVA';
PROC CANCORR SIMPLE CORR;
  VAR Y X;
  WITH A1 A2 B1;
  TITLE 'CCA SUBSUMES FACTORIAL MANOVA';
PROC REG;
  MODEL Y = X B / STB;
  TITLE 'MULTIPLE REGRESSION OF X AND B ON Y';
PROC DISCRIM SIMPLE WCOV WCORR PCOV PCORR;
  VAR X Y;
  CLASS A;
  TITLE 'DISCRIMINANT ANALYSIS';
PROC CANDISC ALL;
  VAR X Y;
  CLASS A;
PROC CANCORR SIMPLE CORR;
  VAR A1 A2;
  WITH X Y;
  TITLE 'CCA SUBSUMES DISCRIMINANT ANALYSIS';



